Chromatin Immunoprecipitation Sequencing ◾ 235
Figure 6.12 shows that Poly II localization is centered on the TSS, where most peaks are observed.
6.3.7 Peak Annotation
We will continue using R to annotate the peaks called with the MACS3 program, which
stored the peak information in “*peaks.narrowPeak” files. The peaks represent the
most likely locations of protein–DNA interaction in the genome (the content of the
“*peaks.narrowPeak” files is discussed above). The main goal of ChIP-Seq data analysis is
to investigate the biological implications of epigenomic features such as the genomic
binding sites of proteins like TFs, histones, and Poly II. Annotating the protein–DNA
interaction sites and their functions provides important information about those implications.
Peak annotation is the process of associating the sites identified by the peaks with the genes
and the regions of those genes affected by the epigenetic change. Most interactions, such as
TF binding and the initial localization of Poly II and histones, occur in the cis-regulatory
region of a gene, which is close to the TSS and includes promoters, enhancers, silencers,
insulators, etc. These elements play crucial roles in controlling gene expression in specific
cell types, conditions, and developmental stages. An annotation program annotates ChIP-Seq
peaks by associating each peak with the closest TSS of a gene, either upstream or downstream.
A cis-regulatory region can also lie at a distance from the TSS or between the TSSs of two
different genes.
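The closest-TSS rule described above can be illustrated with a minimal base R sketch. This is only a toy illustration of the idea, not the way any particular Bioconductor annotation package is implemented. The gene and peak coordinates are made up, and the helper name nearest_tss is hypothetical. The sketch assumes that each peak is represented by a single summit position and that the TSS is the gene start on the + strand and the gene end on the - strand.

```r
# Toy gene table (hypothetical coordinates, not real annotation).
genes <- data.frame(
  gene   = c("geneA", "geneB", "geneC"),
  chr    = c("chr1", "chr1", "chr2"),
  start  = c(1000, 5000, 2000),
  end    = c(3000, 9000, 4000),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)
# Assumption: TSS = gene start on the + strand, gene end on the - strand.
genes$tss <- ifelse(genes$strand == "+", genes$start, genes$end)

# Toy peaks, each described by its chromosome and summit position.
peaks <- data.frame(
  peak   = c("peak1", "peak2", "peak3"),
  chr    = c("chr1", "chr1", "chr2"),
  summit = c(1200, 8500, 1500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for one peak, find the nearest TSS on the same
# chromosome and report the strand-aware distance
# (negative = upstream of the TSS, positive = downstream).
nearest_tss <- function(chr, summit, genes) {
  cand <- genes[genes$chr == chr, ]
  if (nrow(cand) == 0) {
    return(data.frame(gene = NA_character_, distance = NA_real_,
                      stringsAsFactors = FALSE))
  }
  offset <- summit - cand$tss
  # Flip the sign on the - strand so that "upstream" is always negative.
  signed <- ifelse(cand$strand == "+", offset, -offset)
  i <- which.min(abs(signed))
  data.frame(gene = cand$gene[i], distance = signed[i],
             stringsAsFactors = FALSE)
}

# Annotate every peak and bind the results to the peak table.
hits <- do.call(rbind, lapply(seq_len(nrow(peaks)), function(i) {
  nearest_tss(peaks$chr[i], peaks$summit[i], genes)
}))
annotated <- cbind(peaks, hits)
print(annotated)
```

In this toy data, peak1 is assigned to geneA (200 bp downstream of its TSS), peak2 to geneB (500 bp downstream on the - strand), and peak3 to geneC (500 bp upstream). A real annotation program works from a full gene model database and usually also reports the genomic feature (promoter, exon, intron, intergenic) that the peak overlaps.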
We will continue using R Bioconductor packages, with the “*peaks.narrowPeak” files as
inputs for annotation. The following code creates a list of the sample file names and labels
the samples “chip1”, “chip2”, and “chip3”, respectively:
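# List the peak files in the "vis" directory (names ending in ".bed")
bedfiles <- list.files("vis", pattern = "\\.bed$", full.names = TRUE)
bedfiles <- as.list(bedfiles)
# Label the samples; the order must match the alphabetical order of the files
names(bedfiles) <- c("chip1", "chip2", "chip3")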
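Because the labels are assigned by position, it can help to confirm that the files found match the labels before going further. The following is a minimal sketch that assumes a throwaway directory with hypothetical file names. It stands in for the “vis” folder only so that the example runs on its own. It also shows why an anchored pattern is the safer choice: an unanchored ".bed" would also match the backup file.

```r
# Build a throwaway directory that mimics the "vis" folder
# (directory and file names are hypothetical).
vis_dir <- file.path(tempdir(), "vis_demo")
dir.create(vis_dir, showWarnings = FALSE)
demo_files <- c("chip1_peaks.bed", "chip2_peaks.bed",
                "chip3_peaks.bed", "chip1_peaks.bed.bak")
invisible(file.create(file.path(vis_dir, demo_files)))

# Anchored pattern: only names that end in ".bed" are kept,
# so the ".bed.bak" file is skipped.
bedfiles <- as.list(list.files(vis_dir, pattern = "\\.bed$",
                               full.names = TRUE))

# Guard against a silent mismatch between the files and the labels.
sample_labels <- c("chip1", "chip2", "chip3")
stopifnot(length(bedfiles) == length(sample_labels))
names(bedfiles) <- sample_labels

# Inspect which file each label points to.
print(vapply(bedfiles, basename, character(1)))
```

list.files() returns names in alphabetical order, so the check above only confirms the count. Printing the mapping is what confirms that each label points to the intended sample.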
Then, we can assign the database of known human genes to a variable so that we can
use its annotation information and associate it with the peaks.
FIGURE 6.12 Average profile of ChIP-Seq peaks across.